Abstract

In this paper we consider two large populations with non-overlapping generations. The 
number of males and females in each population is constant. Each generation is replaced by 
a new one, and every individual has two parents as far as nuclear DNA is concerned and a 
single one (the mother) as far as mtDNA is concerned. Moreover, at every generation some 
individuals migrate from the first population to the second. 

In a finite random time T, the mtDNA of the second population is completely replaced by the 
mtDNA of the first. Over the same time, the nuclear DNA is not completely replaced and a 
fraction F of the ancient nuclear DNA persists. We compute both T and F. Since this study 
shows that complete replacement of mtDNA in a population is compatible with the persistence 
of a large fraction of nuclear DNA, it may have some relevance for the Out of 
Africa/Multiregional debate in paleoanthropology. 
PACS: 87.23.Kg, 05.40.-a 

INTRODUCTION 



Mitochondrial DNA (mtDNA) is inherited in a haploid manner through females. Since 
its mutation rate is high and can be easily measured, mtDNA is a powerful tool for tracking 
matrilineages, and it has been widely used in this role by molecular biologists. In 
contrast, nuclear DNA is inherited from both parents and recombines at every generation. 
We show in this paper that haploid reproduction allows for a complete replacement of the 
mtDNA of a population by the mtDNA of immigrants, whereas diploid reproduction 
allows some of the ancient nuclear DNA to persist. 

We consider two interbreeding populations with non-overlapping generations whose 
size is large and constant in time. Within each population the number of males equals the 
number of females and is constant. Each generation is replaced by a new one, and every 
individual has two parents as far as nuclear DNA is concerned and a single one (the mother) 
as far as mtDNA is concerned. One of the two populations, which we call the African 
population, produces some emigrants at every generation (2p on average). The second 
population, which we call the Asian population, receives these people as immigrants. 
The two populations need not have the same size: we assume that the number of African 
females is M while the number of Asian females is N, so that the total number of 
individuals in the two populations is 2N + 2M. 

Let us now explain how reproduction and migration are modeled. We assume that every 
individual in the new generation chooses its two parents independently at random from the 
previous one, and that the choice of one individual is independent of the choices of the 
others. Moreover, Africans always choose among Africans, while Asians choose each parent 
among Asians with probability 1 - p/N and among Africans with probability p/N. The choice 
is neutral, i.e. there is no preferred choice among African individuals, nor among Asian 
ones. Choosing an African parent with probability p/N is equivalent to a migration of 2p 
Africans on average, one half of whom, still on average, are females. Note that the number 
of emigrants remains finite even if the population becomes very large ($N \to \infty$). 
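
The parent-choice rule above can be illustrated for the maternal line alone. The following minimal sketch is not part of the original model description: it draws the mother of each of the N newborn Asian females and counts how many of those mothers are African. The function name, the fixed seed and the parameter values are assumptions chosen only for illustration.

```python
import random


def count_african_mothers(n_asian_females, p, rng):
    """Count newborn Asian females whose mother is African in one generation.

    Each newborn independently picks an African mother with probability p/N,
    where N is the number of Asian females.
    """
    prob_african = p / n_asian_females
    return sum(1 for _ in range(n_asian_females) if rng.random() < prob_african)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    p = 1.0
    for n in (100, 1000, 10000):
        counts = [count_african_mothers(n, p, rng) for _ in range(200)]
        # The average count is governed by p, not by the population size n.
        print(n, sum(counts) / len(counts))
```

With p held fixed, the average count stays close to p whatever the value of N, which is the sense in which the number of immigrants remains finite in a very large population.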

If the migration rate p vanishes, the mtDNA of one population cannot be transmitted to 
the other. In this case, the African mtDNA and the Asian mtDNA separately undergo 
standard coalescence (see the cited references, including more recent ones for dynamical 
aspects). In contrast, if migration is allowed, we show that in a finite random time T the 
mtDNA of the Asian population is completely replaced by the mtDNA of immigrants (time is 
measured as the number of generations divided by N). We also show that, over the same time, 
a fraction F of the nuclear DNA persists in the Asian population. In other words, the mtDNA 
of the Asian population living at a time T before present has completely disappeared from 
the present population, while a fraction F of the ancient nuclear DNA still persists 
(hereafter the 'ancient nuclear DNA' is the nuclear DNA of the Asian population living TN 
generations before present). Complete mtDNA replacement together with nuclear DNA 
persistence occurs even if the migration rate is very low. 

We find the random replacement time T in the second section and the fraction F of ancient 
nuclear DNA in the third. In the last section we discuss the possible relevance of our 
results for the debate about Out of Africa and Multiregional models in paleoanthropology. 
The mathematical core of the paper is the Appendix, where we show that the fraction of 
ancient nuclear DNA is a deterministic quantity which decreases exponentially in time. This 
is the reason why the ancient nuclear DNA can be diluted by the nuclear DNA of immigrants 
but cannot be totally replaced. 

REPLACEMENT TIME 

Let us start with the following remark: since Africans always choose among Africans, 
African mtDNA undergoes standard coalescence. The average coalescence time for African 
females is 2M/N, which means 2M generations (the probability density of the coalescence 
time can be found, for example, in [6]). In contrast, the present Asian female population 
may have mtDNA ancestors both in the Asian and in the African population. Assume that at 
a given time in the past the number of Asian female mtDNA ancestors is n. Going backward 
by one generation, the probability that this number decreases to n - 1 is b(n)/N, where 
b(n) = pn + n(n - 1)/2. The term pn/N is the probability that one of the female ancestors 
chooses an African mother, and the term n(n - 1)/(2N) is the probability that two of the 
female ancestors choose the same Asian mother (the celebrated coalescence phenomenon). 
Then, the probability that the number of ancestors remains the same going backward for 
tN generations is $[1 - b(n)/N]^{tN}$ which, for large N, becomes exp(-b(n)t). Therefore, 
the time $t_n$ needed to reduce the number of Asian female ancestors from n to n - 1 is 
exponentially distributed with average $[pn + n(n-1)/2]^{-1}$. Finally, the time needed for 
complete replacement, i.e. for the present Asian population to have no Asian mtDNA 
ancestors, is 

$$ T = \sum_{j=1}^{\infty} t_j \qquad (1) $$

which is a sum of independent, exponentially distributed random times with parameters 
b(j) = pj + j(j - 1)/2. Note that the above sum starts from one (not from two as in the 
coalescent), since complete replacement occurs when the number of Asian ancestors vanishes. 
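
As an illustration of equation (1), the replacement time can be sampled directly as a sum of independent exponential waiting times. The sketch below is only an illustration under stated assumptions: the infinite sum is truncated at a finite number of lineages `n_start`, and the function names, seed and sample size are hypothetical choices rather than anything specified in the paper.

```python
import random


def exit_rate(n, p):
    """Rate b(n) = p*n + n*(n-1)/2 at which n Asian lineages drop to n-1."""
    return p * n + n * (n - 1) / 2.0


def sample_replacement_time(p, n_start, rng):
    """Draw one replacement time T, in units of N generations.

    The sum over lineage counts in equation (1) is truncated at n_start,
    an assumption made here so that the sum is finite.
    """
    return sum(rng.expovariate(exit_rate(n, p)) for n in range(1, n_start + 1))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    samples = [sample_replacement_time(1.0, 100, rng) for _ in range(2000)]
    print(sum(samples) / len(samples))  # empirical mean of T for p = 1
```

The last terms of the sum contribute very little because b(n) grows like n squared, so the truncation mainly affects the shortest waiting times.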
The average replacement time is then 

$$ \langle T \rangle = \sum_{j=1}^{\infty} \left[ jp + \frac{j(j-1)}{2} \right]^{-1} \qquad (2) $$

which is plotted in the figure. 
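
Equation (2) is easy to evaluate numerically. The following sketch approximates the series by a partial sum; the function name and the number of terms are assumptions made for illustration. For p = 1 each term equals 2/(j(j+1)) and the series telescopes to 2, while for p = 2 each term equals 2/(j(j+3)) and the series sums to 11/9, roughly 1.22.

```python
def mean_replacement_time(p, n_terms=100_000):
    """Partial sum of equation (2): average replacement time in units of N generations."""
    return sum(1.0 / (p * j + j * (j - 1) / 2.0) for j in range(1, n_terms + 1))


if __name__ == "__main__":
    for p in (0.5, 1.0, 2.0):
        print(p, mean_replacement_time(p))
```

The neglected tail of the series is of order 2/n_terms, so the default truncation is accurate to about four decimal places.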

It should be remarked that the number of emigrants required for a rapid replacement of 
mtDNA is very small. One couple of immigrants at every generation (p = 1) in a large 
population of size N allows for a complete replacement of the Asian mtDNA in about 
2N generations, while with two couples (p = 2) about 60% of that time is sufficient. 

ANCIENT NUCLEAR DNA 

We have shown that the mtDNA of the Asian generation at a time T in the past (the 
ancient generation) has completely disappeared from the present Asian population. We now 
want to see what happened to the nuclear DNA of that generation (the ancient DNA). 

Let us first remark that the fraction of ancient DNA in the ancient generation equals 
1, while at any following generation this fraction is less than 1. In fact, at every 
generation replacement, both the father and the mother are independently chosen among 
Africans with probability p/N and among Asians with probability 1 - p/N. Since the nuclear 
DNA of an individual comes one half from the father and one half from the mother, at every 
generation replacement the fraction of ancient nuclear DNA in the entire Asian population 
is reduced, on average, by a factor (1 - p/N). The non-averaged factor randomly fluctuates 
around this value. Fluctuations are due to the fact that the number of immigrants is random 
and also to the Wright-Fisher diffusion associated with generation replacement. 
Fluctuations of both origins are of order 1/N, as shown in the Appendix. 

FIG. 1: Average mtDNA replacement time (full line) and average ancient nuclear DNA 
fraction (dashed line) versus emigration rate p. 

Fluctuations are small because diploid reproduction rapidly spreads ancient and new 
genes throughout the population. This self-averaging behavior is in sharp contrast with 
haploid reproduction, where a single individual can only carry either new or ancient mtDNA. 

After a time T from the ancient generation (i.e. in the present population), the average 
fraction of ancient nuclear DNA is reduced to $(1 - p/N)^{TN}$ which, for large N, becomes 
exp(-pT). 

For each generation replacement, random fluctuations are of order 1/N and, since 
they are uncorrelated, after TN generations they are of order $1/\sqrt{N}$ (see the 
Appendix). Therefore, when the population is large, the fraction of ancient DNA in the 
present generation is exactly F = exp(-pT), and randomness is only due to the randomness of 
the time T. It must be remarked that F never vanishes, since the random time T is finite 
with probability one. This implies that nuclear DNA is never completely replaced, in 
contrast with mtDNA. As already remarked, this self-averaging behavior in the large N limit 
is typical of diploid reproduction. Indeed, self-averaging, which is proved in the 
Appendix, is the key point of our results, since the persistence of a fraction of ancient 
nuclear DNA when mtDNA is totally replaced is a direct consequence of it. 

In order to see how F depends on p, we now take the average of F with respect to T and 
obtain 

$$ \langle F \rangle = \langle \exp(-pT) \rangle = \prod_{n=1}^{\infty} \frac{2np + n(n-1)}{2(n+1)p + n(n-1)} \qquad (3) $$

which is also plotted in the figure. 
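
Equation (3) can be evaluated as a truncated product. Each factor is the average of exp(-p t_n) for an exponential time t_n with rate b(n) = pn + n(n - 1)/2, namely b(n)/(b(n) + p). The sketch below is a minimal illustration; the function name and the truncation `n_terms` are assumptions.

```python
def mean_ancient_fraction(p, n_terms=100_000):
    """Truncated product of equation (3): average fraction of ancient nuclear DNA."""
    product = 1.0
    for n in range(1, n_terms + 1):
        pairs = n * (n - 1)
        product *= (2 * n * p + pairs) / (2 * (n + 1) * p + pairs)
    return product


if __name__ == "__main__":
    for p in (0.01, 0.5, 1.0, 2.0):
        print(p, mean_ancient_fraction(p))
```

The n = 1 factor equals 2p/(4p) = 1/2 for every p > 0, so the average fraction can approach 1/2 for small p but cannot exceed it.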

As already remarked, p = 1 allows for a complete replacement of the Asian mtDNA in 
about 2N generations, while for p = 2 about 1.2N generations are sufficient. In the first 
case the average fraction of ancient nuclear DNA is 0.2, while in the second it is 0.12. 
The point is to compare the average replacement time $\langle T \rangle$ with the African 
average coalescence time 2M/N. Assuming that the African population size is equal to or 
larger than the Asian population size ($2M \ge 2N$), one finds that in both of the above 
examples $\langle T \rangle$ is equal to or smaller than 2M/N. In other words, one or two 
couples of immigrants at every generation allow for a replacement of Asian mtDNA in a time 
no longer than the African coalescence time. Furthermore, they allow for the persistence of 
a significant fraction of ancient nuclear DNA. 



DISCUSSION 



Let us now discuss some possible consequences concerning paleoanthropology. Assume 
that N = M = 5000 (number of African and Asian women). In this case, the average 
number of generations required for coalescence of African mtDNA is 2M = 10,000 which, 
assuming generations of 20 years, corresponds to 200,000 years. One couple of immigrants at 
every generation (p = 1) induces complete replacement of the Asian mtDNA in the same time, 
while for two couples (p = 2) the time required is 120,000 years. As already discussed, 
the fraction of ancient nuclear DNA ranges between 0.12 and 0.2. Moreover, if N is smaller 
than M = 5000, the migration rate p required to have $\langle T \rangle < 2M/N$ can be 
smaller than 1 and, therefore, the fraction of ancient nuclear DNA can be higher, up to a 
maximum of 0.5. 
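
The arithmetic behind these figures can be written out as a short sketch. Times in the model are measured in units of N generations, so a time is converted to years by multiplying by N and by the generation length. The function name is an assumption; the input values 2 and 1.2 are the rounded average replacement times quoted in the text for p = 1 and p = 2.

```python
def replacement_time_in_years(mean_time, n_asian_females, years_per_generation=20):
    """Convert a time expressed in units of N generations into years."""
    return mean_time * n_asian_females * years_per_generation


if __name__ == "__main__":
    n = 5000
    print(replacement_time_in_years(2.0, n))  # p = 1
    print(replacement_time_in_years(1.2, n))  # p = 2
```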

In conclusion, the mtDNA argument cannot be used to prove the 'Out of Africa' theory 
or to disprove the Multiregional Model (see the cited reviews of both), since a very small 
migration flux is compatible with both pre-African nuclear DNA persistence and complete 
pre-African mtDNA replacement in Asia and Europe. Indeed, the picture in this paper is 
compatible with [9], where a study of worldwide human nuclear DNA seems to show repeated 
migrations from Africa to Europe and Asia. 

Finally, we would like to mention that the Y chromosome is also inherited in a haploid 
manner, the only difference being that its transmission is driven by males. The qualitative 
and quantitative arguments in the paper remain unchanged if the Y chromosome is considered 
in place of mtDNA. 

APPENDIX 



Let us define $m_i(t)$ as the fraction of ancient nuclear DNA for the male individual i 
at tN generations after the ancient generation, and let us also define the analogous 
$f_i(t)$ for the female individual i. By definition the fraction of ancient DNA for an 
individual of the ancient generation equals one, so that $m_i(0) = f_i(0) = 1$. In the 
reproductive process an individual receives the nuclear DNA of both parents, so that its 
fraction of ancient nuclear DNA is the average of the fractions of the two parents. 
Strictly, this holds exactly only for large genomes such as the human one. The link between 
two generations is then provided by the following stochastic equation 

$$ m_i(t) = \frac{1}{2} \left[ \mu_i(t)\, m_{j(i,t)}(t - \epsilon) + \phi_i(t)\, f_{k(i,t)}(t - \epsilon) \right] \qquad (4) $$

where $\epsilon = 1/N$. The variables $j(i,t)$ and $k(i,t)$ take any integer value between 
1 and N with equal probability 1/N. The variables $\mu_i(t)$ and $\phi_i(t)$ take the value 
0 with probability p/N and 1 with probability 1 - p/N. With our choice the father 
contribution $\frac{1}{2}\mu_i(t)\, m_{j(i,t)}(t - \epsilon)$ to the fraction $m_i(t)$ 
vanishes with probability p/N (African father) and equals one half of the fraction of a 
given Asian father with probability (1 - p/N)/N. The same holds for the mother 
contribution. All variables $\mu_i(t)$, $\phi_i(t)$, $j(i,t)$ and $k(i,t)$ are mutually 
independent; furthermore, two variables of the same type are independent whenever the 
individual indexes or the time indexes are different (for example $j(i,t)$ and $j(k,s)$ 
are independent when $i \neq k$ and/or $t \neq s$). 
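
The update rule (4) lends itself to a direct simulation. The sketch below is a minimal illustration under stated assumptions: it tracks only the scalar fraction of ancient DNA per individual, as in equation (4), and the function names, population size and seed are hypothetical. An African parent is represented by a contribution of 0, since it carries no ancient Asian DNA.

```python
import random


def next_generation(males, females, p, rng):
    """Return the ancient-DNA fractions of N newborns of one sex.

    males, females: fractions of the N Asian fathers and N Asian mothers.
    Each parent is African (contribution 0) with probability p/N, otherwise
    it is drawn uniformly among the Asian parents of the relevant sex.
    """
    n = len(males)
    prob_african = p / n
    newborns = []
    for _ in range(n):
        father = 0.0 if rng.random() < prob_african else rng.choice(males)
        mother = 0.0 if rng.random() < prob_african else rng.choice(females)
        newborns.append(0.5 * (father + mother))
    return newborns


def simulate_ancient_fraction(n, p, generations, rng):
    """Population mean of the ancient fraction after the given number of generations."""
    males = [1.0] * n
    females = [1.0] * n
    for _ in range(generations):
        new_males = next_generation(males, females, p, rng)
        new_females = next_generation(males, females, p, rng)
        males, females = new_males, new_females
    return (sum(males) + sum(females)) / (2 * n)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, for reproducibility only
    n, p = 200, 1.0
    print(simulate_ancient_fraction(n, p, n, rng), (1 - p / n) ** n)
```

Repeating the run with different seeds gives population means that stay close to $(1 - p/N)^{tN}$, which is the self-averaging behavior discussed in the main text.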

An analogous stochastic equation holds for $f_i(t)$, with all variables in it independent 
of the mirror variables in (4). 

Since $\langle \mu_i(t) \rangle = \langle \phi_i(t) \rangle = 1 - p/N$ and since both 
$j(i,t)$ and $k(i,t)$ take any integer value between 1 and N with equal probability 1/N, 
we obtain by averaging (4) 

$$ \langle m_i(t) \rangle = \frac{1}{2N} \left(1 - \frac{p}{N}\right) \sum_{j=1}^{N} \left( \langle m_j(t - \epsilon) \rangle + \langle f_j(t - \epsilon) \rangle \right) \qquad (5) $$

Since averages do not depend on the individual index, and averages for females and 
males must coincide ($\langle m_i(t) \rangle = \langle f_i(t) \rangle$), we have 

$$ \langle m_i(t) \rangle = \left(1 - \frac{p}{N}\right) \langle m_i(t - \epsilon) \rangle = \left(1 - \frac{p}{N}\right)^{tN} \qquad (6) $$

where the second equality is obtained by iteration and from the initial condition 
$m_i(0) = 1$. 
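
As a quick numerical check of equation (6), the finite-N average can be compared with its large-N limit exp(-pt). The sketch below is only an illustration; the function names are assumptions.

```python
import math


def mean_fraction_finite_n(p, t, n):
    """Average ancient fraction after t*N generations, as in equation (6)."""
    return (1.0 - p / n) ** (t * n)


def mean_fraction_limit(p, t):
    """Large-N limit of equation (6)."""
    return math.exp(-p * t)


if __name__ == "__main__":
    p, t = 1.0, 2.0
    for n in (10, 100, 10_000, 1_000_000):
        print(n, mean_fraction_finite_n(p, t, n), mean_fraction_limit(p, t))
```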

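Analogously, $\langle [\mu_i(t)]^2 \rangle = \langle [\phi_i(t)]^2 \rangle = 1 - p/N$ and 
$\langle m_i(t) f_i(t) \rangle = \langle m_i(t) f_j(t) \rangle = \langle m_i(t) m_j(t) \rangle$ 
when $i \neq j$, as can be easily verified using (4). Then we have 

$$ \langle [m_i(t)]^2 \rangle = \frac{1}{2}\left(1 - \frac{p}{N}\right) \langle [m_i(t - \epsilon)]^2 \rangle + \frac{1}{2}\left(1 - \frac{p}{N}\right)^2 \langle m_i(t - \epsilon)\, m_j(t - \epsilon) \rangle \qquad (7) $$

where it is understood that $i \neq j$ and where we have again made use of the independence 
of averages on the individual index and of the fact that averages for females and males 
coincide ($\langle [m_i(t)]^2 \rangle = \langle [f_i(t)]^2 \rangle$). 

Since $\epsilon = 1/N$, this equality can hold only if 

$$ \langle [m_i(t)]^2 \rangle = \langle m_i(t)\, m_j(t) \rangle + o\!\left(\frac{1}{N}\right) \qquad (8) $$

where $o(1/N)$ means 'of order 1/N'. 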

This is the key point, since we will use it to prove that the ancient nuclear DNA 
fraction behaves deterministically in large populations. It is important to remark that 
the above equality is associated with diploid reproduction. In fact, (8) is a direct 
consequence of the fact that the nuclear DNA of an individual is an average of that of both 
parents, as described by equation (4). For haploid reproduction an analogue of equation (4) 
holds, but the contribution comes from a single parent only and equality (8) cannot be 
stated. 

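Let us now define the fraction of ancient nuclear DNA in a population as the mean of 
the fractions of its component individuals 

$$ x(t) = \frac{1}{2N} \sum_{i=1}^{N} \left[ m_i(t) + f_i(t) \right] \qquad (9) $$

then, from (6) we immediately obtain 

$$ \langle x(t) \rangle = \left(1 - \frac{p}{N}\right)^{tN} \qquad (10) $$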

We will now show that x(t) is a deterministic variable in a large population and, 
therefore, coincides with its average. Using again individual index symmetries and 
male/female symmetry we obtain from the equations above 

$$ \langle [x(t)]^2 \rangle = \left(1 - \frac{p}{N}\right)^2 \langle [x(t - \epsilon)]^2 \rangle + R(t - \epsilon) \qquad (11) $$

where 

$$ R(t) = \frac{1}{2N}\left(1 - \frac{p}{N}\right) \left( \langle [m_i(t)]^2 \rangle - \left(1 - \frac{p}{N}\right) \langle m_i(t)\, m_j(t) \rangle \right) \qquad (12) $$

Using equality (8) we obtain $R(t) = o(1/N^2)$. Indeed this is the key of the proof: for 
haploid DNA transmission we would obtain the same equation (11), but we would have 
$R(t) = o(1/N)$ in place of $R(t) = o(1/N^2)$. 

From equation (11) together with the condition $R(t) = o(1/N^2)$ we obtain 

$$ \langle [x(t)]^2 \rangle = \left(1 - \frac{p}{N}\right)^{2tN} + o\!\left(\frac{1}{N}\right) \qquad (13) $$

which compared with (10) tells us that 

$$ x(t) = \left(1 - \frac{p}{N}\right)^{tN} \pm o\!\left(\frac{1}{\sqrt{N}}\right) \qquad (14) $$

We remark that for haploid reproduction fluctuations are of order 1 ($o(1)$ in place of 
$o(1/\sqrt{N})$), which is why mtDNA disappears in a finite random time, even for large 
populations. 

Finally, in the large N limit, we obtain from (13) x(t) = exp(-pt), which tells us 
that the ancient nuclear DNA fraction decreases exponentially. Moreover, since the random 
replacement time is finite with probability one, the fraction F = x(T) = exp(-pT) is 
always finite (i.e. strictly positive). 